Generalizing continuous-space translation of paralinguistic information
Abstract
In previous work, we proposed a model for speech-to-speech translation that is sensitive to paralinguistic information such as the duration and power of spoken words [1]. This model uses linear regression to map source acoustic features directly to target acoustic features in continuous space. However, while the model is effective, it scales poorly: a separate model must be trained for every word, which makes it difficult to generalize to words for which we do not have parallel speech. In this work, we first demonstrate that simply training a single linear regression model on all words is not sufficient to express paralinguistic translation. We then describe a neural network model that has sufficient expressive power to perform paralinguistic translation with a single model. We evaluate the proposed method on a digit translation task and show that a single neural network-based model achieves results similar to those of the word-dependent models used in previous work.
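The contrast between word-dependent regression and one pooled linear model can be illustrated with a minimal sketch. This is not the authors' setup: the toy durations, the two digit words, and the helper names `fit_line` and `mse` are all hypothetical, and only a single scalar feature (duration) is used. It shows that when each word has its own linear source-to-target relation, a regression fitted per word recovers it, while one linear model that ignores word identity cannot.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a scalar input x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def mse(model, xs, ys):
    """Mean squared error of a fitted (slope, intercept) pair."""
    a, b = model
    return mean((a * x + b - y) ** 2 for x, y in zip(xs, ys))

# Hypothetical toy data: source-word duration -> target-word duration (seconds).
# Each word follows its own linear relation, invented for illustration only.
source = [0.2, 0.3, 0.4, 0.5]
data = {
    "one": (source, [2.0 * x + 0.1 for x in source]),
    "two": (source, [0.5 * x + 0.3 for x in source]),
}

# Word-dependent models: one regression per word.
per_word = {w: fit_line(xs, ys) for w, (xs, ys) in data.items()}
per_word_err = mean(mse(per_word[w], xs, ys) for w, (xs, ys) in data.items())

# Single pooled model: one regression over all words, ignoring word identity.
all_x = [x for xs, _ in data.values() for x in xs]
all_y = [y for _, ys in data.values() for y in ys]
pooled = fit_line(all_x, all_y)
pooled_err = mse(pooled, all_x, all_y)

print(f"per-word MSE: {per_word_err:.6f}, pooled MSE: {pooled_err:.6f}")
```

On this toy data the per-word error is essentially zero while the pooled error is not, which is the kind of gap a more expressive single model, such as the neural network the abstract describes, would need to close.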
Similar resources
A method for translation of paralinguistic information
This paper is concerned with speech-to-speech translation that is sensitive to paralinguistic information. From the many different possible paralinguistic features to handle, in this paper we chose duration and power as a first step, proposing a method that can translate these features from input speech to the output speech in continuous space. This is done in a simple and language-independent f...
s-Topological vector spaces
In this paper, we have defined and studied a generalized form of topological vector spaces called s-topological vector spaces. s-topological vector spaces are defined by using semi-open sets and semi-continuity in the sense of Levine. Along with other results, it is proved that every s-topological vector space is a generalized homogeneous space. Every open subspace of an s-topological vector space is...
An Investigation of the Linguistic, Paralinguistic and Sociocultural Effects of Input on the Perception and Translation of Gerunds by Persian Speakers of English
This study investigated native Persian speakers' perception of gerunds through translation, using three elicitation techniques: written, audio, and pictorial. Eighty intermediate learners of English were asked to select the Persian translation of the gerund forms in these elicitation techniques. They were asked to choose one option from a pair of written first la...
A Study of the Relationship between Acoustic Features of “bæle” and the Paralinguistic Information
Language users rely on special phonetic tools to communicate linguistic information as well as emotional and paralinguistic information in daily conversation. Because they help convey semantic information to listeners, prosodic features form an essential part of linguistic behaviour; manipulating them can potentially play an important role in transmitt...
Lefschetz theorem for valuations, complex integral geometry, and unitarily invariant valuations
We obtain new general results on the structure of the space of translation invariant continuous valuations on convex sets (a version of the hard Lefschetz theorem). Using these and our previous results we obtain an explicit characterization of unitarily invariant translation invariant continuous valuations. It implies new integral geometric formulas for real submanifolds in Hermitian spaces genera...